fix(eval): preserve custom eval metadata by he-yufeng · Pull Request #5922 · google/adk-python

he-yufeng · 2026-05-31T22:39:25Z

Summary

allow evaluation models to preserve caller-provided metadata fields
add a regression test covering extra fields on SessionInput and EvalCase

Fixes #5906

To verify

PYTHONPATH=src python -m pytest tests/unittests/evaluation/test_eval_case.py -q
python -m pyink --check src/google/adk/evaluation/common.py tests/unittests/evaluation/test_eval_case.py
python -m ruff check src/google/adk/evaluation/common.py tests/unittests/evaluation/test_eval_case.py
git diff --check

he-yufeng · 2026-06-02T19:14:54Z

Rebased this branch onto the latest main.

Validation on the rebased branch:

PYTHONPATH=src .venv\Scripts\python.exe -m pytest tests\unittests\evaluation\test_eval_case.py -q
PYTHONPATH=src .venv\Scripts\python.exe -m py_compile src\google\adk\evaluation\common.py tests\unittests\evaluation\test_eval_case.py
git diff --check upstream/main...HEAD

Note: using the repo-local .venv is necessary on my Windows machine; the global conda environment has an older/incompatible google-genai install.

adk-bot · 2026-06-05T18:17:35Z

I will begin by reading the instructions for the adk-pr-analyze skill to ensure I follow the correct procedure for analyzing the pull request.I will run the Contributor License Agreement (CLA) verification script to ensure the contributor has signed the Google CLA before proceeding with any analysis of PR #5922.I will fetch the details of GitHub Pull Request #5922 using the custom fetch_github_pr tool.I will view the triage_pr.py helper script to see how it performs verification and if we can perform a equivalent, permissible check or find another way to handle CLA verification.I will fetch the status check information for PR #5922 to verify the CLA signature status of the contributor using the gh command.I will fetch the details of the linked issue #5906 using the custom fetch_github_issue tool.I will retrieve the complete line-by-line diff of PR #5922 using the gh tool.I will view the src/google/adk/evaluation/common.py file to inspect its content in our baseline workspace.I will view the unit test file tests/unittests/evaluation/test_eval_case.py in our workspace to inspect how tests are designed around evaluation cases.I will list the contents of the src/google/adk/evaluation folder to see what other files are there and how the evaluation module is structured.I will search for extra="forbid" occurrences in the codebase to see if other modules are similarly configured and whether this strict configuration is localized.I will search for the definition of SessionInput and EvalBaseModel in the codebase to find their exact locations and line numbers, enabling me to include direct links in the report.I will view the SessionInput class in src/google/adk/evaluation/eval_case.py to check its structure and verify its inheritance.I will write the complete and in-depth PR Analysis Report of PR #5922 to the artifact directory. This report details the change objective, the impact and execution flow, its structural and architectural value, and evaluates the implementation against Google ADK's principles.I have successfully completed the PR Analysis for PR #5922.

The detailed PR Analysis Report is now available in your workspace artifacts.

Below is an executive summary of my findings:

🔍 ADK Pull Request Analysis: PR #5922

Title: fix(eval): preserve custom eval metadata
Author: @he-yufeng
Status: open
Impact: 29 additions, 1 deletions across 2 files

Executive Summary

Core Objective: Changes Pydantic's EvalBaseModel configuration in common.py from extra="forbid" to extra="allow". This ensures all children evaluation structures (like SessionInput and EvalCase) preserve custom metadata parameters instead of breaking or throwing validation errors.
Justification & Value: [Justified Fix] — Essential usability enhancement that removes a significant developer bottleneck by allowing custom evaluation metrics correlation tags (e.g. tier, category) to pass cleanly through evaluation runs.
Alignment with Principles: [Pass] — Code styling, module structures, and tests conform perfectly to ADK standards.
Recommendation: [Approve with Nits] — Highly polished contribution. Recommending merging immediately, following a minor nit to append -> None onto the return type annotation in the newly added test suite function.

Key Open Questions or Decisions

Review/Merging: I highly recommend approving and merging this change. Do you want me to write a comment on the PR on GitHub containing this feedback, or would you like to review the generated [PR Analysis Report](file:///home/runner/.gemini/antigravity/brain/034a60480025faad4acccfb4432831c4/

rohityan · 2026-06-05T18:26:19Z

Hi @he-yufeng , Thank you for your contribution! We appreciate you taking the time to submit this pull request.

he-yufeng · 2026-06-06T18:40:18Z

Rebased this branch onto current main; the PR diff is back to the intended two files.

Validated locally after the rebase:

python -m py_compile src/google/adk/evaluation/common.py tests/unittests/evaluation/test_eval_case.py
PYTHONPATH=src C:\dev\GITHUB-clean\adk-python\.venv\Scripts\python.exe -m pytest -q tests/unittests/evaluation/test_eval_case.py -k metadata --basetemp .tmp\pytest-5922 -p no:cacheprovider (1 passed)
git diff --check upstream/main..HEAD

I could not run local pyink from the reused ADK venv because that environment does not include pyink; CI will rerun the full pre-commit suite.

adk-bot · 2026-06-06T18:41:33Z

I have successfully analyzed Google ADK Pull Request PR #5922.

Google CLA Validation: Verified successfully (Status: Signed & Active).
Scope & Correctness: Fully validated against the baseline workspace and the linked issue Issue #5906.

The detailed PR Analysis Report is now available in your workspace artifacts.

Summary of Work Done

CLA Verification: Verified the contributor's CLA validation success status.
Context Retrieval: Fetched PR payload and linked Issue #5906 details.
Workspace inspection: Searched and reviewed the baseline configurations of EvalBaseModel in common.py, subclasses under eval_case.py, and the regression tests suite test_eval_case.py.
Report Authoring: Penned down a thorough aesthetic and architectural assessment report detailing design alternatives and standard compliance.

Key Recommendation

Decision: Approve & Merge
Since the contribution is extremely elegant and uses the standard Pydantic configuration (extra="allow") to gracefully preserve user-supplied tags on evaluation base structures, we recommend merging this PR. Only a minor, non-blocking suggestion is highlighted in the report regarding test function formatting conventions.

adk-bot added the eval [Component] This issue is related to evaluation label May 31, 2026

surajksharma07 mentioned this pull request Jun 1, 2026

EvalBaseModel extra='forbid' rejects custom fields in session_input and eval cases #5906

Open

rohityan self-assigned this Jun 1, 2026

he-yufeng force-pushed the fix/eval-extra-metadata-fresh branch 2 times, most recently from 9516053 to 1d17dd7 Compare June 2, 2026 19:14

rohityan added the request clarification [Status] The maintainer need clarification or more information from the author label Jun 5, 2026

rohityan removed the request clarification [Status] The maintainer need clarification or more information from the author label Jun 5, 2026

fix(eval): preserve custom eval metadata

4c60c7a

he-yufeng force-pushed the fix/eval-extra-metadata-fresh branch from 0d31501 to 4c60c7a Compare June 6, 2026 18:40

he-yufeng mentioned this pull request Jun 6, 2026

fix(eval): preserve custom eval metadata #5908

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(eval): preserve custom eval metadata#5922

fix(eval): preserve custom eval metadata#5922
he-yufeng wants to merge 1 commit into
google:mainfrom
he-yufeng:fix/eval-extra-metadata-fresh

he-yufeng commented May 31, 2026

Uh oh!

he-yufeng commented Jun 2, 2026

Uh oh!

adk-bot commented Jun 5, 2026

Uh oh!

rohityan commented Jun 5, 2026 •

edited

Loading

Uh oh!

he-yufeng commented Jun 6, 2026

Uh oh!

adk-bot commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

he-yufeng commented May 31, 2026

Summary

To verify

Uh oh!

he-yufeng commented Jun 2, 2026

Uh oh!

adk-bot commented Jun 5, 2026

🔍 ADK Pull Request Analysis: PR #5922

Executive Summary

Key Open Questions or Decisions

Uh oh!

rohityan commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

he-yufeng commented Jun 6, 2026

Uh oh!

adk-bot commented Jun 6, 2026

Summary of Work Done

Key Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

rohityan commented Jun 5, 2026 •

edited

Loading